Using R to Predict Antimicrobial Resistance Rates.

Analysis performed using antimicrobial susceptibility test data for all isolates from blood cultures collected at NUTH from Q1 2019 onwards.

Daniel Weiand

Newcastle upon Tyne Hospitals NHS Foundation Trust

Friday, 30 June, 2023

Session aims

  1. Introduce R

  2. Point out useful learning resources

  3. Run through a working example of a project completed (in its entirety) using R

What is R?

  • R is one of the most commonly used languages for data science, together with Python.

  • R is a powerful, free, open-source environment for data science and statistics, used in academia and industry, including major corporations (e.g. Microsoft, Google, Facebook).

  • R benefits from a worldwide community that freely shares learning and resources, e.g. through GitHub.

Let’s start with the elephant in the room

Yes, to use R, we need to learn some code

No, it’s not rocket science.

Learn to copy-paste (rather than to code from scratch)

Why use R for data science?

  • Data has transformed our world in powerful ways and can help us make better decisions.

  • Almost every interaction with the health service leaves a digital trace - raw information that has phenomenal potential.

  • But raw data is not powerful on its own. It must be shaped, checked, curated and analysed. And then it must be communicated, and acted upon. This work requires people, with modern data skills, in teams, using platforms like R to do the heavy lifting and avoid needless duplication of effort.

  • The Goldacre report actively promotes the use of R in the NHS.

  • NUTH now actively supports the use of R at scale, and it can be installed on any work PC (simply call IT and ask to be added to the “SCCM-R” group).

Many of you are already familiar with R

NHS-R

  • The Health Foundation supports NHS-R, which delivers free-to-NHS-staff online training.

  • It’s free to register.

  • Courses are really popular and spaces are limited to about 20 per session. Sessions are scheduled once a month. To be notified when further dates are scheduled, please contact: nhs.rcommunity@nhs.net

  • NHS-R runs the premier data science conference in the NHS, along with regular skill-based webinars.

  • NHS-R supports a thriving Slack community

Other useful resources

R helps maintain momentum and promotes collaboration

We know that projects can quickly lose momentum and get stuck

R can beat mission creep

“Just one more subanalysis”

Think: Re-running reports after filtering the data by various substrata.

R easily handles big (and growing) data sets

“Just a bit more data”

Think: Re-running reports when more data becomes available.

R can help you get to the end

“Just (one more… last… final!) final report”

Think: Re-drafting the same report multiple times to get your paper accepted through peer review

R supports reproducible research and QI

This presentation was written in R

And these papers were written entirely in R

R and Python are quickly overtaking SAS, SPSS and Stata

  • In terms of jobs

  • In terms of output

  • In terms of impact

R and Python are quickly overtaking SAS, SPSS and Stata

  • In terms of Google Scholar hits

  • In terms of Stack Overflow queries

Avoiding Excel has its benefits

So now for a working example

  • Aim: Predict Antimicrobial Resistance (AMR) Rates for Blood Culture Isolates at NUTH, using R

  • Objectives:

    • Import blood culture data into R

    • Wrangle, visualise and explore data using R

    • Analyse historical AMR rates, and model future AMR rates using R

Methods

  • The laboratory information management system (LIMS) was queried for all positive blood cultures collected between 2019-04-01 and 2023-03-31.
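
The date restriction described above can be sketched in base R. This is only an illustration under stated assumptions: the CSV text, the column names (specimen_id, patient_id, collection_date, organism) and every row are invented, and a real LIMS export would be read from a file rather than from an inline string.

```r
# Hypothetical LIMS export (invented rows and column names).
# A real extract would typically be read with read.csv("<path to export>").
lims_csv <- "specimen_id,patient_id,collection_date,organism
S001,P01,2019-03-28,Escherichia coli
S002,P02,2019-04-01,Staphylococcus aureus
S003,P01,2021-07-15,Escherichia coli
S004,P03,2023-03-31,Klebsiella pneumoniae
S005,P04,2023-04-02,Escherichia coli"

cultures <- read.csv(text = lims_csv, stringsAsFactors = FALSE)

# Convert the ISO-formatted text dates into Date objects so they can be compared.
cultures$collection_date <- as.Date(cultures$collection_date)

# Study window taken from the Methods slide (both ends inclusive).
study_start <- as.Date("2019-04-01")
study_end   <- as.Date("2023-03-31")

in_window <- cultures$collection_date >= study_start &
  cultures$collection_date <= study_end
study_cultures <- cultures[in_window, ]

print(study_cultures)
```

Keeping the window in two named variables means the same script can be re-run unchanged when a later extract covers a longer period.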

The AMR package

  • The AMR package [1,2] is a free, open-source and independent package for R [3] that provides a standard for clean and reproducible analysis and prediction of Antimicrobial Resistance (AMR).

  • This package was used to determine ‘first isolates’, as per Hindler and Stelling [4], for use in the final analysis; to calculate and visualise AMR data; and to predict future AMR rates using regression models.
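
To make the idea of ‘first isolates’ concrete, here is a minimal base-R sketch. It is not the AMR package's implementation, which is what the analysis actually used and which is more sophisticated. The sketch assumes invented data, hypothetical column names, and a simple rule chosen for illustration: an isolate counts as a ‘first isolate’ if it is the first of its species for that patient, or if at least 365 days have passed since that patient's previous first isolate of the same species.

```r
# Invented example data; column names are hypothetical.
isolates <- data.frame(
  patient_id = c("P01", "P01", "P01", "P02", "P02", "P01"),
  organism = c("E. coli", "E. coli", "E. coli", "S. aureus", "E. coli", "S. aureus"),
  collection_date = as.Date(c("2019-05-01", "2019-05-03", "2020-06-10",
                              "2019-08-20", "2019-08-20", "2019-05-02")),
  stringsAsFactors = FALSE
)

# Flag first isolates per patient and organism.
# episode_days = 365 is an assumption made for this sketch.
flag_first_isolates <- function(df, episode_days = 365) {
  # Sort so that each patient/organism group is contiguous and in date order.
  df <- df[order(df$patient_id, df$organism, df$collection_date), ]
  df$first_isolate <- FALSE
  last_key <- NA_character_
  last_first_date <- as.Date(NA)
  for (i in seq_len(nrow(df))) {
    key <- paste(df$patient_id[i], df$organism[i], sep = "|")
    new_group <- is.na(last_key) || key != last_key
    # A new group always starts an episode; otherwise compare against the
    # date of the most recent first isolate in the same group.
    if (new_group ||
        df$collection_date[i] - last_first_date >= episode_days) {
      df$first_isolate[i] <- TRUE
      last_first_date <- df$collection_date[i]
    }
    last_key <- key
  }
  df
}

flagged <- flag_first_isolates(isolates)
print(flagged)
```

In this toy data only the repeat E. coli isolate taken two days after the first one for patient P01 is flagged as a duplicate.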

Predicting future AMR rates

  • The AMR package [1,2] includes a function which, based on a date column, calculates cases per year and uses a regression model to predict antimicrobial resistance.

  • The resistance_predict() function creates a prediction model including standard errors (SE), which are returned as columns se_min and se_max.

  • Valid options for the statistical model (argument model) are: “binomial”, “poisson” and “linear”.
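
The general approach of fitting a regression to yearly counts can be sketched in base R. This is not the AMR package's code and it does not call resistance_predict(). It assumes a hypothetical table of yearly resistant and susceptible counts (invented numbers) and fits a binomial model with glm(). The column names se_min and se_max are borrowed from the description above; here they simply hold the estimate minus and plus one standard error, clipped to the range 0 to 1.

```r
# Invented yearly counts of resistant and susceptible first isolates.
yearly <- data.frame(
  year = 2019:2022,
  resistant = c(30, 36, 41, 50),
  susceptible = c(170, 164, 159, 150)
)

# Binomial (logistic) regression of the resistant proportion on calendar year.
fit <- glm(cbind(resistant, susceptible) ~ year,
           family = binomial, data = yearly)

# Predict the resistant proportion for future years, with standard errors
# on the response (proportion) scale.
future <- data.frame(year = 2023:2025)
pred <- predict(fit, newdata = future, type = "response", se.fit = TRUE)

forecast <- data.frame(
  year = future$year,
  estimate = pred$fit,
  se_min = pmax(pred$fit - pred$se.fit, 0),  # lower bound, not below 0
  se_max = pmin(pred$fit + pred$se.fit, 1)   # upper bound, not above 1
)

print(forecast)
```

Swapping family = binomial for family = poisson, or using lm() on the yearly proportions, would give rough analogues of the other model options listed above, again only as an illustration.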

Totals

  • In total, 11098 distinct positive blood cultures were collected from 6888 distinct patients, leading to isolation of 12272 organisms.

  • Taking into consideration ‘first isolates’ only, 8780 distinct positive blood cultures were collected from 6888 distinct patients, leading to isolation of 9648 organisms.

  • From this point onwards, the analysis considers only ‘first isolates’ from blood cultures, so that repeat isolates are de-duplicated in a principled way.
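
Totals of this kind can be computed with a few lines of base R. The sketch below uses invented data and hypothetical column names, and assumes one row per isolated organism with a logical first_isolate column already present.

```r
# Invented example: one row per organism isolated from a blood culture.
isolates <- data.frame(
  culture_id = c("C1", "C1", "C2", "C3", "C4"),
  patient_id = c("P1", "P1", "P1", "P2", "P3"),
  organism = c("E. coli", "K. pneumoniae", "E. coli", "S. aureus", "E. coli"),
  first_isolate = c(TRUE, TRUE, FALSE, TRUE, TRUE),
  stringsAsFactors = FALSE
)

# Count distinct cultures, distinct patients and isolated organisms (rows).
summarise_counts <- function(df) {
  c(
    cultures = length(unique(df$culture_id)),
    patients = length(unique(df$patient_id)),
    organisms = nrow(df)
  )
}

all_counts <- summarise_counts(isolates)
first_counts <- summarise_counts(isolates[isolates$first_isolate, ])

print(rbind(all_isolates = all_counts, first_isolates = first_counts))
```

As in the totals above, restricting to first isolates reduces the number of cultures and organisms in this toy data but leaves the number of distinct patients unchanged.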

Location of blood culture collection

Organism data

Mean age of infection

Time series data

Distinct patients with positive blood cultures per month

Survival data

Using R to predict AMR rates for a single antimicrobial

Using R to compare AMR rate-predictions between different antimicrobials

Using R to contrast predicted AMR rates in different patient populations

Thanks for listening

Daniel Weiand, Consultant medical microbiologist

Newcastle upon Tyne Hospitals NHS Foundation Trust

Email: dweiand@nhs.net

Twitter: @send2dan

NHS-R community blog: https://nhsrcommunity.com/author/daniel-weiand/

GitHub: send2dan

References

1. Berends MS, Luz CF, Friedrich AW, et al. AMR: An R package for working with antimicrobial resistance data. Journal of Statistical Software 2022;104:1–31. doi:10.18637/jss.v104.i03
2. Berends MS, Luz CF, Souverein D, et al. AMR: Antimicrobial resistance data analysis. 2023. https://CRAN.R-project.org/package=AMR
3. R Core Team. R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing 2022. https://www.R-project.org/
4. Hindler JF, Stelling J. Analysis and presentation of cumulative antibiograms: A new consensus guideline from the Clinical and Laboratory Standards Institute. Clinical Infectious Diseases 2007;44:867–73.